Improving Legal Document Summarization Using Graphical Models

Authors

  • M. Saravanan
  • Balaraman Ravindran
  • S. Raman
Abstract

In this paper, we propose a novel approach that applies probabilistic graphical models to automatic text summarization in the legal domain. Identifying the rhetorical roles of the sentences in a legal document is the key text mining step in this task. A Conditional Random Field (CRF) is applied to segment a given legal document into seven labeled components, where each label represents a rhetorical role. Feature sets with varying characteristics are employed to significantly improve the CRF's performance. Our system is then enriched by applying a term distribution model with structured domain knowledge to extract key sentences related to the rhetorical categories. The final structured summary has been observed to come close to an 80% accuracy level relative to the ideal summary generated by experts in the area.

Related resources

A Coherent Biomedical Literature Clustering and Summarization Approach Through Ontology-Enriched Graphical Representations

In this paper, we introduce a coherent biomedical literature clustering and summarization approach that employs a graphical representation method for text using a biomedical ontology. The key to the approach is to construct document cluster models as semantic chunks capturing the core semantic relationships in the ontology-enriched scale-free graphical representation of documents. These documen...

The Use of Thematic Structure and Concept Identification for Legal Text Summarization

LetSum is a summarization system developed for producing short summaries for legal decisions. LetSum is built with an approach based on the exploration of the document structure and thematic segmentation in order to produce a table-style summary for improving coherency and readability of the text. We present the components of the system and its implementation.

A Survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data in order to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...

Semantic Text Mining and its Application in Biomedical Domain

Illhoi Yoo; Xiaohua Hu, Ph.D. A huge amount of biomedical knowledge and novel discoveries have been produced and collected in text databases or digital libraries, such as MEDLINE, because the most natural form to store information is text. In order to cope with this pressing text information overload, text mining is employed. However, ...

Building a Trainable Multi-document Summarizer

This paper describes an approach to building a trainable multi-document summarization system, using a simple training process based on support vector machines. The summarization system is trained and tested using the DUC 2005 data set. The evaluation results based on ROUGE scores are presented and methods for improving the performance of the summarization system are identified.

Publication date: 2006